A visual concomitant of the Lombard reflex
Abstract
The aim of the study was to examine how visual speech (speech-related head and face movement) might vary as a function of communicative environment. To this end, head and face movement was recorded for four talkers who uttered the same ten sentences in quiet and in four background noise conditions (babble and white noise, each presented through ear plugs or loudspeakers). The sentences were also spoken in a whisper. Changes between the normal and in-noise conditions were apparent in many of the principal components (PCs) of head and face movement. To simplify the analysis of differences between conditions, only the first six movement PCs were considered. The strength and composition of the changes were variable. Large changes occurred for jaw and mouth movement, face expansion and contraction, and head rotation in the Z axis. Minimal change occurred for PC3 (rigid head translation in the Z axis). Whispered speech showed many of the characteristics of speech produced in noise but was distinguished by a marked increase in head translation in the Z axis. Analyses of the correlation between auditory speech intensity and movement under the different production conditions also revealed a complex pattern of changes. The correlations between RMS speech energy and the PCs that involved jaw and mouth movement (PC1 and PC2) increased markedly from the normal to the in-noise production conditions. The correlation between RMS energy and movement also increased for head Z-rotation as a function of speaking condition. No increases were observed for the movement associated with head Z-translation, lip protrusion, or mouth opening with face contraction. These findings suggest that the relationships underlying audio-visual speech perception may differ depending on how the speech was produced.
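The two analysis steps named in the abstract (reducing movement data to a few principal components, then correlating each component with RMS speech energy) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the function names (`principal_components`, `frame_rms`, `rms_pc_correlations`) are hypothetical, and PCA is done with a plain SVD of mean-centred data.

```python
import numpy as np


def principal_components(movement, n_components=6):
    """PCA via SVD of mean-centred data.

    movement: array (n_frames, n_features), e.g. flattened marker coordinates.
    Returns per-frame PC scores, the component vectors, and the fraction of
    variance explained by each retained component.
    """
    centred = movement - movement.mean(axis=0)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, vt[:n_components], explained[:n_components]


def frame_rms(audio, n_frames):
    """RMS energy of the audio in n_frames equal-length windows."""
    usable = len(audio) - len(audio) % n_frames
    windows = audio[:usable].reshape(n_frames, -1)
    return np.sqrt(np.mean(windows ** 2, axis=1))


def rms_pc_correlations(scores, rms):
    """Pearson correlation between RMS energy and each PC score series."""
    return np.array(
        [np.corrcoef(scores[:, k], rms)[0, 1] for k in range(scores.shape[1])]
    )


# Synthetic stand-in data (an assumption for illustration only): one latent
# "jaw opening" signal drives both the movement features and the audio level.
rng = np.random.default_rng(0)
n_frames, n_features, samples_per_frame = 200, 12, 160
jaw = np.abs(np.sin(np.linspace(0, 8 * np.pi, n_frames)))
loadings = rng.normal(size=n_features)
movement = 5 * np.outer(jaw, loadings) + rng.normal(
    scale=0.1, size=(n_frames, n_features)
)
audio = np.repeat(jaw, samples_per_frame) * rng.normal(
    size=n_frames * samples_per_frame
)

scores, components, explained = principal_components(movement, n_components=6)
rms = frame_rms(audio, n_frames)
corr = rms_pc_correlations(scores, rms)
print("variance explained:", np.round(explained, 3))
print("RMS correlation per PC:", np.round(corr, 3))
```

The sign of a principal component is arbitrary, so the magnitude of each correlation is the quantity to compare. Running the same computation separately on each production condition (quiet, in-noise, whisper) would give the kind of per-condition comparison the abstract describes.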
Similar resources
Investigating the role of the Lombard reflex in visual and audiovisual speech recognition
This study focuses on the analysis of the Lombard effect in visual and audiovisual speech recognition. Previous studies have shown that the performance of an audio-only automatic speech recognizer decreases in noisy environments because of the Lombard reflex. A few studies have considered the visual changes due to the Lombard reflex, but the role of the Lombard reflex in automatic visual speech...
The Lombard effect
How does it work? Although the adjustment of vocal intensity happens involuntarily when background noise levels change, the phenomenon is not truly a reflex. Much of what we do know about how the Lombard effect works at a neural level comes from comparative work on non-human primates and other mammals. From these studies we learn that the essential circuits responsible for the Lombard effect ar...
The Lombard effect: a reflex to better communicate with others in noise
To study the Lombard reflex, more realistic databases representing real-world conditions need to be recorded and analyzed. In this paper we 1) summarize a procedure to record Lombard data which provides a good approximation of realistic conditions, 2) present an analysis per class of sounds for duration and energy of words recorded while subjects are listening to noise through open-ear headphon...
Influence of the speaking style and the noise spectral tilt on the Lombard reflex and automatic speech recognition
To study the Lombard reflex, more realistic databases representing real world conditions need to be recorded and analyzed. In this paper we 1) propose a procedure to record Lombard data which provides a good approximation of realistic conditions and 2) present a comparison between two sets of experiments where subjects are in communication with a device while listening to noise through open-ear...
Investigating the role of the Lombard reflex in non-audible murmur (NAM) recognition
In this paper, we report non-audible murmur (NAM) recognition results in noisy environments and investigate the effect of the Lombard reflex on non-audible murmur recognition. Non-audible murmur is speech uttered very quietly and captured through body tissue by a special acoustic sensor (e.g., a NAM microphone). A system based on non-audible murmur recognition can be applied in cases when privacy ...
Publication date: 2005